Revise ABI-specific part of -preview=in #11828
Thanks for your pull request and interest in making D better, @kinke! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.
Bugzilla references
Your PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Testing this PR locally
If you don't have a local development environment setup, you can use Digger to test this PR:
dub run digger -- build "master + dmd#11828"
Pinging @Geod24 and @thewilsonator (as requested).
// By ref because of size
void testin2(in ulong[4] p) {}
void testin2(in ulong[64] p) { static assert(__traits(isRef, p)); }
Increasing the size just to be on the safe side - some ABIs might be able to pass a 32-byte aggregate in registers.
Doesn't seem wise to add a static assert here. You're forcing a behaviour when there is no spec on one.
It's not spec'd, but if a 512-byte arg is ever to be copied, then `-preview=in` isn't implemented as intended.
Edit: Unless we really want to please the aliasing-paranoiacs, but even then, the compiler should be able to infer that the used lvalue cannot be aliased. Scratch that, even in that case, the param should still be a ref, only the callers might make an extra copy of the lvalue arg.
Do you mean, passed in memory?
I mean passed by-value, i.e., the thing that is tested that it's not.
Well, isRef would mean that no copy is made? There would still need to be spec on that though. From my reading of it, only non-trivial types must be passed by ref. Everything else either explicitly by value, or unenforceable.
It's not spec'd, but do you see any ABI on the horizon where it would be advantageous to pass half a KB by value over passing a ref? If that's the case in a few decades, the test can be adapted, until then, it makes sure the implementation is up to the expectation of eliding expensive copies.
The Win64 failures are because of dmd/test/compilable/previewin.d Lines 22 to 42 in 37556c0
This raises the question whether
f891220
to
dc09dfd
One thing to consider: At the moment,
Inference is not done at the moment, which is a big annoyance IMO. This is essentially issue 9423. In it, Kenji provides the rationale for it.
diff --git a/src/dmd/expression.d b/src/dmd/expression.d
index 546937875..1e15b8ae3 100644
--- a/src/dmd/expression.d
+++ b/src/dmd/expression.d
@@ -3923,7 +3923,7 @@ extern (C++) final class FuncExp : Expression
auto tiargs = new Objects();
tiargs.reserve(td.parameters.dim);
- foreach (tp; *td.parameters)
+ foreach (idx, tp; *td.parameters)
{
size_t u = 0;
foreach (i, p; tf.parameterList)
@@ -3939,6 +3939,9 @@ extern (C++) final class FuncExp : Expression
Type t = pto.type;
if (t.ty == Terror)
return cannotInfer(this, to, flag);
+ // Set storage classes on the literal to match that of the
+ // definition.
+ tf.parameterList[idx].storageClass |= pto.storageClass;
tiargs.push(t);
}
Although I need to have a look at whether or not it works with nested delegates. But if there's a difference in the
Okay, covariance rules tightened to prevent these kinds of implementation-dependent compile errors.
I wouldn't have expected to be able to use the prebuilt libs directly, without recompiling them. I don't think we should remove all possibly incompatible
// -preview=in: add `ref` storage class to suited `in` params
if (global.params.previewIn && (fparam.storageClass & (STC.in_ | STC.ref_)) == STC.in_)
{
    auto ts = t.baseElemOf().isTypeStruct();
    const isPOD = !ts || ts.sym.isPOD();
    if (!isPOD || target.preferPassByRef(t))
        fparam.storageClass |= STC.ref_;
}
I wish I'd known we'd go this way earlier :)
Personally I don't mind, but it goes against @tsbockman's review and what was discussed in the forum thread.
FWIW, I've always mentioned that it should be based on the param type only. And the implementation proposals here and for LDC make no use of the param position, and I doubt Iain will make use of it.
The only part that might be interesting is the relationship of the `in` parameter with other parameters, but that's something which is diagnosed much later, and not determinable from inspecting the function signature alone.
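The type-based promotion discussed in this thread can be observed from user code. The following is a minimal sketch, not part of the PR: `Big`, `NonPod` and the `byRef*` helpers are made-up names, and the result for `Big` is an assumption about the targets mentioned in this PR, since that decision is left to `target.preferPassByRef`. It relies only on `__traits(isRef, p)`, which the PR's own test also uses.

```d
import std.stdio : writeln;

// Hypothetical types, for illustration only.
struct Big    { ulong[64] data; }      // 512-byte POD
struct NonPod { int x; ~this() {} }    // the destructor makes it non-POD

// Each helper reports whether the compiler promoted its `in` param to `ref`.
bool byRefInt(in int p)       { return __traits(isRef, p); }
bool byRefBig(in Big p)       { return __traits(isRef, p); }
bool byRefNonPod(in NonPod p) { return __traits(isRef, p); }

void main()
{
    int i;
    Big big;
    NonPod np;

    // The printed values depend on the -preview=in switch and on the target ABI.
    writeln("in int    by ref: ", byRefInt(i));
    writeln("in Big    by ref: ", byRefBig(big));
    writeln("in NonPod by ref: ", byRefNonPod(np));
}
```

Built with `dmd -preview=in`, the non-POD case is expected to report `true` on every target, because the snippet under review always promotes non-POD types. Without the switch, all three lines should report `false`.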
src/dmd/target.d
Outdated
 * Returns true if the specified parameter type (a POD) should be passed by
 * ref for `in` params with `-preview=in`.
Can we keep somewhat more polished documentation? At the very least this should be a `Returns` section, but the fact that the parameter is only a POD could easily go in the `Params` documentation, and the fact that this is only for `-preview=in` should be more prominent.
Maybe you find this documentation redundant, but it is especially useful for people exploring the code.
Slightly reworked.
Why not? It's very possible to do, without much trouble, so why not?
There are probably many places in generic Phobos code where new
There's currently no plan for D3 though. Also you can't even use it in generic code because it could subtly break if the compiler knows an instance was instantiated in another root module, and said root module was not compiled with
Mixing code compiled with and without
c65c451
to
ccd0dd9
Well, the problem is really that Phobos is unique in that it's shipped compiled. Other libraries on
Yes, that was very intentional, exactly so that any mismatch results in at least a linker error, and not a runtime corruption. Regarding Win64, wouldn't it be more efficient to change the D ABI to treat a slice as two parameters? The current situation, with two indirections, seems less than ideal. I assume this also applies to delegates?
There's no D ABI. ;) - LDC passes it as 2 separate args. But that's more of a kludge [because it would require more work when calling druntime hooks from the compiler]. Walter wants it to be equivalent to
Yes, but here LDC conforms with DMD, i.e., both pass it indirectly. Edit: These inefficiencies are exactly why
In this particular case with its deep implications, going with the 'avoiding incompatibilities' sounds like the short-term solution, although care will need to be taken for future PRs too so as not to introduce regressions. The full potential can only be leveraged by revising the libs and shipping+linking a separate set of prebuilt libs, but that's a big task, and only feasible once enough people have been convinced about
Some Win64 asm produced by DMD (this PR) for:
void extReal(in real, in real);
void extSlice(in int[], in int[]);
void testReal(in real a, in real b)
{
    extReal(a, b);
}
void testSlice(in int[] a, in int[] b)
{
    extSlice(a, b);
}
_D7current8testRealFIKeIKeZv:
0000000000000000: 55 push rbp
0000000000000001: 48 8B EC mov rbp,rsp
0000000000000004: 48 83 EC 20 sub rsp,20h
0000000000000008: E8 00 00 00 00 call _D7current7extRealFIKeIKeZv
000000000000000D: 48 83 C4 20 add rsp,20h
0000000000000011: 5D pop rbp
0000000000000012: C3 ret
_D7current9testSliceFIKAiIKQeZv:
0000000000000000: 55 push rbp
0000000000000001: 48 8B EC mov rbp,rsp
0000000000000004: 48 83 EC 20 sub rsp,20h
0000000000000008: E8 00 00 00 00 call _D7current8extSliceFIKAiIKQeZv
000000000000000D: 48 83 C4 20 add rsp,20h
0000000000000011: 5D pop rbp
0000000000000012: C3 ret
_D7current8testRealFIeIeZv:
0000000000000000: 55 push rbp
0000000000000001: 48 8B EC mov rbp,rsp
0000000000000004: 48 83 EC 20 sub rsp,20h
0000000000000008: DB 2A fld tbyte ptr [rdx]
000000000000000A: DB 7D E0 fstp tbyte ptr [rbp-20h]
000000000000000D: 48 8D 55 E0 lea rdx,[rbp-20h]
0000000000000011: DB 29 fld tbyte ptr [rcx]
0000000000000013: DB 7D F0 fstp tbyte ptr [rbp-10h]
0000000000000016: 48 8D 4D F0 lea rcx,[rbp-10h]
000000000000001A: 48 83 EC 20 sub rsp,20h
000000000000001E: E8 00 00 00 00 call _D7current7extRealFIeIeZv
0000000000000023: 48 83 C4 20 add rsp,20h
0000000000000027: 48 8B E5 mov rsp,rbp
000000000000002A: 5D pop rbp
000000000000002B: C3 ret
_D7current9testSliceFIAiIQdZv:
0000000000000000: 55 push rbp
0000000000000001: 48 8B EC mov rbp,rsp
0000000000000004: 48 83 EC 20 sub rsp,20h
0000000000000008: 48 8B 02 mov rax,qword ptr [rdx]
000000000000000B: 48 8B 52 08 mov rdx,qword ptr [rdx+8]
000000000000000F: 48 89 45 E0 mov qword ptr [rbp-20h],rax
0000000000000013: 48 89 55 E8 mov qword ptr [rbp-18h],rdx
0000000000000017: 48 8D 55 E0 lea rdx,[rbp-20h]
000000000000001B: 48 8B 01 mov rax,qword ptr [rcx]
000000000000001E: 48 8B 49 08 mov rcx,qword ptr [rcx+8]
0000000000000022: 48 89 45 F0 mov qword ptr [rbp-10h],rax
0000000000000026: 48 89 4D F8 mov qword ptr [rbp-8],rcx
000000000000002A: 48 8D 4D F0 lea rcx,[rbp-10h]
000000000000002E: 48 83 EC 20 sub rsp,20h
0000000000000032: E8 00 00 00 00 call _D7current8extSliceFIAiIQdZv
0000000000000037: 48 83 C4 20 add rsp,20h
000000000000003B: 48 8B E5 mov rsp,rbp
000000000000003E: 5D pop rbp
000000000000003F: C3 ret
_D7current8testRealFIKeIKeZv:
0000000000000000: 55 push rbp
0000000000000001: 48 8B EC mov rbp,rsp
0000000000000004: 48 83 EC 20 sub rsp,20h
0000000000000008: 48 89 4D 10 mov qword ptr [rbp+10h],rcx
000000000000000C: 48 89 55 18 mov qword ptr [rbp+18h],rdx
0000000000000010: 48 8B 55 18 mov rdx,qword ptr [rbp+18h]
0000000000000014: 48 8B 4D 10 mov rcx,qword ptr [rbp+10h]
0000000000000018: E8 00 00 00 00 call _D7current7extRealFIKeIKeZv
000000000000001D: 48 8B E5 mov rsp,rbp
0000000000000020: 5D pop rbp
0000000000000021: C3 ret
_D7current9testSliceFIKAiIKQeZv:
0000000000000000: 55 push rbp
0000000000000001: 48 8B EC mov rbp,rsp
0000000000000004: 48 83 EC 20 sub rsp,20h
0000000000000008: 48 89 4D 10 mov qword ptr [rbp+10h],rcx
000000000000000C: 48 89 55 18 mov qword ptr [rbp+18h],rdx
0000000000000010: 48 8B 55 18 mov rdx,qword ptr [rbp+18h]
0000000000000014: 48 8B 4D 10 mov rcx,qword ptr [rbp+10h]
0000000000000018: E8 00 00 00 00 call _D7current8extSliceFIKAiIKQeZv
000000000000001D: 48 8B E5 mov rsp,rbp
0000000000000020: 5D pop rbp
0000000000000021: C3 ret
_D7current8testRealFIeIeZv:
0000000000000000: 55 push rbp
0000000000000001: 48 8B EC mov rbp,rsp
0000000000000004: 48 83 EC 50 sub rsp,50h
0000000000000008: 48 89 5D D8 mov qword ptr [rbp-28h],rbx
000000000000000C: 48 89 4D 10 mov qword ptr [rbp+10h],rcx
0000000000000010: 48 89 55 18 mov qword ptr [rbp+18h],rdx
0000000000000014: 48 8B 45 18 mov rax,qword ptr [rbp+18h]
0000000000000018: DB 28 fld tbyte ptr [rax]
000000000000001A: DB 7D F0 fstp tbyte ptr [rbp-10h]
000000000000001D: 48 8D 55 F0 lea rdx,[rbp-10h]
0000000000000021: 48 8B 5D 10 mov rbx,qword ptr [rbp+10h]
0000000000000025: DB 2B fld tbyte ptr [rbx]
0000000000000027: DB 7D E0 fstp tbyte ptr [rbp-20h]
000000000000002A: 48 8D 4D E0 lea rcx,[rbp-20h]
000000000000002E: E8 00 00 00 00 call _D7current7extRealFIeIeZv
0000000000000033: 48 8B 5D D8 mov rbx,qword ptr [rbp-28h]
0000000000000037: 48 8B E5 mov rsp,rbp
000000000000003A: 5D pop rbp
000000000000003B: C3 ret
_D7current9testSliceFIAiIQdZv:
0000000000000000: 55 push rbp
0000000000000001: 48 8B EC mov rbp,rsp
0000000000000004: 48 83 EC 50 sub rsp,50h
0000000000000008: 48 89 5D D8 mov qword ptr [rbp-28h],rbx
000000000000000C: 48 89 4D 10 mov qword ptr [rbp+10h],rcx
0000000000000010: 48 89 55 18 mov qword ptr [rbp+18h],rdx
0000000000000014: 48 8B 45 18 mov rax,qword ptr [rbp+18h]
0000000000000018: 48 8B 50 08 mov rdx,qword ptr [rax+8]
000000000000001C: 48 8B 00 mov rax,qword ptr [rax]
000000000000001F: 48 89 45 F0 mov qword ptr [rbp-10h],rax
0000000000000023: 48 89 55 F8 mov qword ptr [rbp-8],rdx
0000000000000027: 48 8D 55 F0 lea rdx,[rbp-10h]
000000000000002B: 48 8B 5D 10 mov rbx,qword ptr [rbp+10h]
000000000000002F: 48 8B 4B 08 mov rcx,qword ptr [rbx+8]
0000000000000033: 48 8B 03 mov rax,qword ptr [rbx]
0000000000000036: 48 89 45 E0 mov qword ptr [rbp-20h],rax
000000000000003A: 48 89 4D E8 mov qword ptr [rbp-18h],rcx
000000000000003E: 48 8D 4D E0 lea rcx,[rbp-20h]
0000000000000042: E8 00 00 00 00 call _D7current8extSliceFIAiIQdZv
0000000000000047: 48 8B 5D D8 mov rbx,qword ptr [rbp-28h]
000000000000004B: 48 8B E5 mov rsp,rbp
000000000000004E: 5D pop rbp
000000000000004F: C3 ret
How did you check the libs? Hacking the compiler, using some regex to wade through the source, checking mangled names in the libs, ...?
What's the status of this?
druntime/Phobos need to be rechecked, i.e., there must be no existing
Hacking the compiler and adding a printf when the promotion happens.
This is the promised follow-up on dlang#11000 and makes DMD exploit the specifics of the few supported ABIs (Win64, SysV x86_64, 32-bit x86). It also almost perfectly matches the proposed LDC implementation in ldc-developers/ldc#3578 (just a minor divergence for Win64 and dynamic arrays, but on that point the LDC and DMD Win64 ABIs diverge in general).
Require the `in` storage class on both sides, just like ref/out/lazy. Doing so makes sure the code compiles with all compilers and for all targets, independent of whether `in` means `const scope` or `const scope ref`.
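To illustrate the "just like ref/out/lazy" part: the long-standing rule for `ref` already makes delegate types with and without the storage class mutually incompatible, and this commit applies the same strictness to `in`. A minimal sketch with made-up alias names follows. The `ref` assertions hold regardless of compiler switches, while the `in` results are only printed at compile time, because they depend on whether the code is built with `-preview=in`.

```d
// Hypothetical aliases, for illustration only.
alias ValDg = void delegate(int);
alias RefDg = void delegate(ref int);

// A `ref` mismatch is rejected in both directions.
static assert(!is(RefDg : ValDg));
static assert(!is(ValDg : RefDg));

alias ConstDg = void delegate(const char[] s);
alias InDg    = void delegate(in char[] s);

// Printed during compilation; compare the output with and without -preview=in.
pragma(msg, "InDg -> ConstDg convertible: ", is(InDg : ConstDg));
pragma(msg, "ConstDg -> InDg convertible: ", is(ConstDg : InDg));

void main() {}
```

Requiring the storage class on both sides means such a conversion either compiles on every target or on none, instead of depending on whether a particular ABI happens to promote the parameter to `ref`.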
There are apparently many
With Win64 special cases in place for slices and delegates, druntime seems fine for all targets, and Phobos should be after dlang/phobos#7687.
Confirmed, libs are fine now.
Great! My main concern was really covariance with
Thx. - The output contained almost 10k warnings related to
LGTM. Few comments though:
1. I suppose you want to get rid of the 3rd and 5th commits (the latter being a revert of the former).
2. I'm not so fond of tightening the covariance rules, as I wanted to make it as easy to use as possible. If a lib doesn't use `in`, you can still use `in` in your code, and pass a delegate that accepts `in`. However, with the current changes, this possibility is gone. Note that this was mostly important for slices. But I don't think it's a dealbreaker, so let's give it a try.
3. Can you merge the 4th commit into the first, and move the special case from Win64 to `typesem`? IMO it's much clearer to the reader to know what happens based on types than on sizes of types, which is why the original version had those:
else if (p.type.ty == Tarray)
continue;
// Pass delegates by value to allow covariance
// Function pointers are a single pointers and handled below.
else if (p.type.ty == Tdelegate)
continue;
Also added dlang/ci#437
Regarding 3, okay that makes sense.
That would require a bit of refactoring to get rid of the ugly code duplication -> both commits removed instead. ;)
Thanks!
I expect this to be slightly more performant than the previous behavior, where a delegate was treated like a corresponding struct, passed via hidden pointer and returned via sret. The primary motivation is a smooth preparation for PR ldc-developers#3578 - in order to allow people to experiment with `-preview=in` without recompiling druntime and Phobos, `in` slices and delegates must not be passed by-ref with `-preview=in` (see dlang/dmd#11828). This would have required a special case for delegates on Win64, which is IMO better handled this way.